ABSTRACT

The vectors $p = (p_1,\ldots,p_k)'$ representing the cell probabilities
of a multinomial distribution are partially ordered according to the
majorization relation: $p$ majorizes $p'$ ($p \succ p'$) if
$\sum_{i=1}^{j} p_{(i)} \ge \sum_{i=1}^{j} p'_{(i)}$, $j = 1,\ldots,k$,
where $p_{(i)}$ denotes the ith largest value among $p_1,\ldots,p_k$.
If the reverse inequality holds we say that $p$ minorizes $p'$
($p \prec p'$). In this paper we consider a test of the hypothesis
H: $p \prec p^0$ against the alternative hypothesis H': $p \succ p^0$,
where $p^0$ is a given vector. The test discriminates between the
situations where the total multinomial probability is distributed more
or less evenly among the k cells. It is therefore called a
polarization test.
1.  Introduction.  Let M(x; p, n) denote the multinomial
distribution, where $x = (x_1,\ldots,x_k)'$ denotes the vector of cell
frequencies, $p = (p_1,\ldots,p_k)'$ denotes the vector of cell
probabilities, $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} p_i = 1$.
Consider a partial ordering of the probability vectors p, given by the
majorization relation: $p$ majorizes $p'$ ($p \succ p'$) if
$\sum_{i=1}^{j} p_{(i)} \ge \sum_{i=1}^{j} p'_{(i)}$, $j = 1,\ldots,k$,
where $p_{(i)}$ denotes the ith largest value among $p_1,\ldots,p_k$.
If the reverse inequality holds we say that $p$ minorizes $p'$
($p \prec p'$). A symmetric function f is said to be Schur-convex if
$f(p) \ge f(p')$ for all $p \succ p'$. For example,
$Q = \sum_{i=1}^{k} p_i^2$ is a Schur-convex function. Clearly,
$1/k \le Q \le 1$. If the k multinomial events are nearly equally
probable then the value of Q is close to its lower bound. On the other
hand, if the total probability is almost concentrated into a single
cell then the value of Q is close to 1. Thus, the value of Q measures,
so to speak, the "polarization" of the multinomial distribution. More
generally, the multinomial distribution associated with $p$ is said to
be more polarized than the multinomial distribution associated with
$p'$ if $p \succ p'$. Note that the vector $(1/k,\ldots,1/k)$ is
majorized by every vector p. In this paper we consider a test of the
hypothesis H: $p \prec p^0$ against the alternative hypothesis
H': $p \succ p^0$, where $p^0$ is a given value of p. The test is based
on the statistic $T = \left(\sum_{i=1}^{k} x_i^2\right)/n$, rejecting H
for large values of T. We call it a polarization test. We note that the
polarization test is a one-sided test, whereas Pearson's chi-square
test for goodness of fit is two-sided, designed to test the hypothesis
$p = p^0$ against the alternative hypothesis $p \ne p^0$. For k = 2 the
hypothesis H states that $|p_1 - p_2| \le |p_1^0 - p_2^0|$, while the
reverse inequality holds for H'.
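As an illustration of the definitions above, the following Python sketch checks the majorization relation and computes Q and T. The function names (`majorizes`, `polarization_index`, `polarization_statistic`) and the tolerance `tol` are assumptions made for this example and do not appear in the paper.

```python
def majorizes(p, q, tol=1e-12):
    """Return True if p majorizes q.

    Both vectors are sorted in decreasing order and every partial sum
    of p must be at least the matching partial sum of q. `tol` is a
    hypothetical allowance for floating-point rounding.
    """
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    sum_p = sum_q = 0.0
    for a, b in zip(sorted(p, reverse=True), sorted(q, reverse=True)):
        sum_p += a
        sum_q += b
        if sum_p < sum_q - tol:
            return False
    return True


def polarization_index(p):
    """Q = sum of squared cell probabilities."""
    return sum(v * v for v in p)


def polarization_statistic(x):
    """T = (sum of squared cell frequencies) / n, with n = sum(x)."""
    n = sum(x)
    return sum(v * v for v in x) / n


uniform = (1 / 3, 1 / 3, 1 / 3)
print(majorizes((0.7, 0.2, 0.1), uniform))          # every vector majorizes the uniform one
print(majorizes((0.6, 0.2, 0.2), (0.5, 0.4, 0.1)))  # neither majorizes the other
print(majorizes((0.5, 0.4, 0.1), (0.6, 0.2, 0.2)))
```

The last two calls show why the ordering is only partial: some pairs of probability vectors are not comparable, so H and H' together do not exhaust all possible p.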

The problem of testing H against H' arises in various situations.
Suppose, for example, that k political parties are contesting an
election. Let $p_i$ denote the proportion of voters in favor of the
ith party (i = 1,...,k) at a certain period of time before the
election. It might be of interest to know at a subsequent period of
time before the election whether, due to the emergence of a certain
issue or the occurrence of a certain event, the voting preference had
polarized in the sense that a single party or, at most, a few parties
out of the k parties would share together almost all the votes.

For another example, suppose that the population of a variety of fish
is spread out in certain parts of a lake. It might be of interest to
know whether the fish population had concentrated into fewer parts of
the lake at a certain time, that is, whether the fish population had
polarized due to a change in weather conditions or some other factor.

In the following section it is shown that the given test is unbiased.
For the application of the test we need to know the distribution of T.
Formulas are given for the exact distribution and for the asymptotic
distribution for large n. Numerical results are given showing the
convergence of the exact distribution to the asymptotic one.

2.  Polarization test.  Consider the hypothesis H. We reject H for
large values of T. By Theorem 3.7 of Hollander, Proschan and
Sethuraman (1977), $P\{T \ge t \mid p\}$ is a Schur-convex function of
p for any positive number t. Hence the rejection probability at any
$p \succ p^0$ is at least that at $p^0$, which in turn is at least
that at any $p \prec p^0$. Therefore, the polarization test is
unbiased. For the application of the theorem note that T is a
Schur-convex function of x and that the multinomial probability
function satisfies the condition required for $\phi(\lambda, x)$ in
the theorem.

For the application of the test we need to find the distribution of T.
First we consider the exact distribution of T. Let $\lceil x \rceil$
denote the smallest non-negative integer $\ge x$, and
$\lfloor x \rfloor$ denote the largest integer $\le x$. Let
$D_k(t; p_1,\ldots,p_k, n) = P\{T \le t\}$ denote the cumulative
distribution function (cdf) of T for $k \ge 2$. The cdf is recursively
given by

(2.1)

      0,   t < n/2
The recursive relation is based on the fact that the conditional
distribution of $u = (x_1,\ldots,x_{k-1})'$, given $x_k$, is
$M\left(u;\ \frac{p_1}{1-p_k},\ldots,\frac{p_{k-1}}{1-p_k},\ n - x_k\right)$.

Next we consider the asymptotic distribution of T for large n. The
covariance matrix of x is $n\Sigma = n(D - p p')$, where prime denotes
the transpose and D denotes the diagonal matrix with ith diagonal
element equal to $p_i$. Let $\lambda_1,\ldots,\lambda_{k-1}$ denote the
non-zero eigenvalues of $\Sigma$. An eigenvector corresponding to the
zero eigenvalue is $(1,1,\ldots,1)'$. Let P denote an orthogonal matrix
diagonalizing $\Sigma$ whose first row is equal to
$(1/\sqrt{k},\ldots,1/\sqrt{k})$ and let $y = P x$. The first component
of y is equal to $n/\sqrt{k}$. Let the mean of y be written as

(2.3)    $E\,y = n\,(1/\sqrt{k},\ \mu_1,\ldots,\mu_{k-1})' = n P p.$

From (2.3) we have

(2.4)    $\frac{1}{k} + \sum_{i=1}^{k-1} \mu_i^2 = p'P'Pp = p'p = Q,$

(2.5)    $\sum_{i=1}^{k-1} \lambda_i \mu_i^2 = p'P'(P \Sigma P')Pp = p'(D - p p')p = Q_1 - Q^2,$

where $Q_1 = \sum_{i=1}^{k} p_i^3$. It is easy to see that
$Q_1 \ge Q^2$.
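The quadratic form in (2.5) can be checked numerically. The Python sketch below builds the entries of D - p p' directly and compares p'(D - p p')p with Q_1 - Q^2, taking Q_1 to be the sum of the cubed cell probabilities (a reading of the text that is assumed here, since the definition is hard to make out in this copy). The function names are hypothetical.

```python
def quadratic_form_sigma(p):
    """Compute p' (D - p p') p entry by entry, with D = diag(p)."""
    k = len(p)
    total = 0.0
    for i in range(k):
        for j in range(k):
            # (i, j) entry of D - p p'
            sigma_ij = (p[i] if i == j else 0.0) - p[i] * p[j]
            total += p[i] * sigma_ij * p[j]
    return total


def q1_minus_q_squared(p):
    """Q_1 - Q^2 with Q = sum(p_i^2) and Q_1 = sum(p_i^3) (assumed definition)."""
    q = sum(v**2 for v in p)
    q1 = sum(v**3 for v in p)
    return q1 - q**2


p = (0.5, 0.3, 0.2)
print(quadratic_form_sigma(p), q1_minus_q_squared(p))  # the two values should agree
print(q1_minus_q_squared((0.25, 0.25, 0.25, 0.25)))    # vanishes for equal probabilities
```

The quantity Q_1 - Q^2 is the variance of p_I when the index I is drawn with probabilities p, which is one way to see that it is non-negative and that it vanishes when all positive cell probabilities are equal.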


We have $nT = x'x = y'y$. Since x is asymptotically distributed
according to the multivariate normal distribution with mean $np$ and
covariance $n\Sigma$, we have

(2.6)    $T \sim \frac{n}{k} + \sum_{i=1}^{k-1} \lambda_i Z_i^2,$

where $\sim$ means "asymptotically distributed as" and $Z_i$ is
normally distributed with variance 1 and mean equal to
$\sqrt{n}\,\mu_i/\sqrt{\lambda_i}$. Moreover $Z_1,\ldots,Z_{k-1}$ are
independent.

Let $R = \sum_{i=1}^{k-1} \lambda_i Z_i^2$. Since $Z_i^2$ is
distributed as $\chi_1^2(n\mu_i^2/\lambda_i)$ (non-central chi-square
with 1 degree of freedom and non-centrality parameter
$n\mu_i^2/\lambda_i$), the moment generating function of R is

(2.7)    $M(t) = E\,e^{tR} = \exp\left(n \sum_{i=1}^{k-1} \frac{\mu_i^2\, t}{1 - 2\lambda_i t}\right) \prod_{i=1}^{k-1} (1 - 2\lambda_i t)^{-1/2}, \qquad t < 0.$

From (2.7) it is seen that

(2.8)    $\frac{1}{\sqrt{n}}\left(R - n\sum_{i=1}^{k-1}\mu_i^2\right) \xrightarrow{d} 2\left(\sum_{i=1}^{k-1}\lambda_i\mu_i^2\right)^{1/2} V,$

where V denotes a standard normal random variable. From (2.4),
(2.5), (2.6) and (2.8) we have that

(2.9)    $\frac{1}{\sqrt{n}}\,(T - nQ) \xrightarrow{d} 2\,(Q_1 - Q^2)^{1/2}\,V.$


Suppose that $Q_1 - Q^2 = 0$. Two cases arise: (i)
$p_1 = p_2 = \cdots = p_k$ and (ii) $p_i p_j = 0$ for $i \ne j$. In
case (i) we have $\lambda_1 = \lambda_2 = \cdots = \lambda_{k-1} = 1/k$
and from (2.3) $\mu_1 = \cdots = \mu_{k-1} = 0$. Therefore, $Z_i^2$ is
distributed as $\chi_1^2$ (central chi-square with 1 degree of freedom)
and from (2.6) we have

(2.10)    $k\,(T - nQ) \xrightarrow{d} \chi_{k-1}^2.$

In case (ii) we have T = n with probability 1.

Let $Q^0$ and $Q_1^0$ denote the values of Q and $Q_1$, respectively,
for $p = p^0$, and let $V_\alpha$ denote the upper $\alpha$-quantile of
the standard normal distribution. Let $T_\alpha$ denote the critical
value of the polarization test for a level of significance equal to
$\alpha$, as derived from (2.1) and (2.2). From (2.9) the value of
$T_\alpha$ for large n is approximately given by

(2.11)    $T_\alpha = nQ^0 + 2\sqrt{n}\,\left(Q_1^0 - (Q^0)^2\right)^{1/2} V_\alpha.$

Also, the asymptotic power of the polarization test is equal to

          $\Phi\left(\frac{\sqrt{n}\,(Q - Q^0)}{2\,(Q_1 - Q^2)^{1/2}} - V_\alpha\right),$

where $\Phi$ denotes the standard normal cdf. For
$Q - Q^0 = c/\sqrt{n}$, where c is a given positive number, the
asymptotic power, that is, the Pitman efficiency, is equal to
$\Phi\left(\frac{c}{2}\left(Q_1^0 - (Q^0)^2\right)^{-1/2} - V_\alpha\right)$.
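The large-sample critical value (2.11) and the asymptotic power expression can be evaluated with the Python standard library alone. The sketch below follows the formulas as they are read from the text, with Q_1 taken to be the sum of cubed cell probabilities. The function names and the example vectors are assumptions made for illustration, and the power function requires Q_1 - Q^2 > 0 at the alternative p.

```python
from math import sqrt
from statistics import NormalDist


def q_and_q1(p):
    """Return (Q, Q_1) = (sum p_i^2, sum p_i^3)."""
    return sum(v**2 for v in p), sum(v**3 for v in p)


def asymptotic_critical_value(p0, n, alpha):
    """Approximate T_alpha = n Q0 + 2 sqrt(n) (Q1_0 - Q0^2)^(1/2) V_alpha."""
    q0, q10 = q_and_q1(p0)
    v_alpha = NormalDist().inv_cdf(1.0 - alpha)  # upper alpha-quantile
    spread = sqrt(max(0.0, q10 - q0**2))         # guard against rounding below zero
    return n * q0 + 2.0 * sqrt(n) * spread * v_alpha


def asymptotic_power(p, p0, n, alpha):
    """Phi( sqrt(n) (Q - Q0) / (2 (Q_1 - Q^2)^(1/2)) - V_alpha ).

    Assumes Q_1 - Q^2 > 0 at p; the degenerate cases are excluded.
    """
    q, q1 = q_and_q1(p)
    q0, _ = q_and_q1(p0)
    v_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z = sqrt(n) * (q - q0) / (2.0 * sqrt(q1 - q**2)) - v_alpha
    return NormalDist().cdf(z)


p0 = (0.5, 0.3, 0.2)  # hypothetical null vector
print(asymptotic_critical_value(p0, n=100, alpha=0.05))
print(asymptotic_power((0.7, 0.2, 0.1), p0, n=100, alpha=0.05))
```

At p = p0 the power expression reduces to the level alpha, which gives a quick consistency check on any implementation.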

In order to compare the asymptotic formula with the exact formula we
show in the table below values of $P\{T \le T_\alpha\}$, derived from
(2.1) and (2.2), where $T_\alpha$ is given by (2.11) for
$1 - \alpha = .95$, k = 2, 3, 4 and certain values of n and $p^0$. It
is seen from the table that n = 100 is not sufficiently large for the
asymptotic probability to match the exact probability. It is
interesting to observe that the figures in columns 2, 3 and 5 agree
except for one entry. We have checked the figures given in the table
against the results obtained from a simulation study.

Values of $P\{T \le T_\alpha\}$